Programmable Stream Processors

Authors

  • Ujval J. Kapasi
  • Scott Rixner
  • William J. Dally
  • Brucek Khailany
  • Jung Ho Ahn
  • Peter R. Mattson
  • John D. Owens
Abstract

The complexity of modern media processing, including 3D graphics, image compression, and signal processing, requires tens to hundreds of billions of computations per second. To achieve these computation rates, current media processors use special-purpose architectures tailored to one specific application. Such processors require significant design effort and are thus difficult to change as media-processing applications and algorithms evolve. The demand for flexibility in media processing motivates the use of programmable processors. However, very large-scale integration (VLSI) constraints limit the performance of traditional programmable architectures. In modern VLSI technology, computation is relatively cheap: thousands of arithmetic logic units (ALUs) that operate at multigigahertz rates can fit on a modestly sized 1-cm² die. The problem is that delivering instructions and data to those ALUs is prohibitively expensive. For example, only 6.5 percent of the Itanium 2 die is devoted to the 12 integer and two floating-point ALUs and their register files; communication, control, and storage overhead consume the remaining die area. In contrast, the more efficient communication and control structures of a special-purpose graphics chip, such as the Nvidia GeForce4, enable the use of many hundreds of floating-point and integer ALUs to render 3D images.

Similar resources

Exploring the VLSI Scalability of Stream Processors

Stream processors are high-performance programmable processors optimized to run media applications. Recent work has shown these processors to be more area- and energy-efficient than conventional programmable architectures. This paper explores the scalability of stream architectures to future VLSI technologies where over a thousand floating-point units on a single chip will be feasible. Two techni...

High Speed Motion Estimation Using Stream Processing

Media processing applications are dominant in many new workloads today. These applications, including 3D graphics, image compression, and signal processing, require tens to hundreds of billions of computations per second. To achieve real-time requirements, current media processors use special-purpose architectures. Because of the changes in media applications and standards, flexibility is an imp...

Improving Power Efficiency in Stream Processors Through Dynamic Cluster Reconfiguration

Stream processors support hundreds of functional units in a programmable architecture by clustering functional units and utilizing a bandwidth hierarchy. Clusters are the dominant source of power consumption in stream processors. When the data parallelism falls below the number of clusters, unutilized clusters can be turned off to save power. This paper improves power efficiency in stream proce...

Real-time Ray Tracing on Programmable Graphics Hardware

Recently a breakthrough has occurred in graphics hardware: fixed-function pipelines have been replaced with programmable vertex and fragment processors. In the near future, the graphics pipeline is likely to evolve into a general programmable stream processor capable of more than simply feed-forward triangle rendering. In this paper, we evaluate these trends in programmability of the graphics p...

Cg in Two Pages

The latest real-time graphics architectures include programmable floating-point vertex and fragment processors, with support for data-dependent control flow in the vertex processor. We present a programming language and a supporting system that are designed for programming these stream processors. The language follows the philosophy of C, in that it is a hardware-oriented, general-purpose langua...

Reconfigurable stream processors for wireless base-stations

The need to support evolving standards, rapid prototyping, and fast time-to-market are some of the key reasons for desiring programmability in future wireless base-stations. However, supporting highly complex signal processing algorithms for multiple users at high data rates (in Mbps), requiring billions of operations per second, while providing power efficiency presents challenges in attaining t...

Journal:
  • IEEE Computer

Volume: 36

Publication date: 2003